Active hearing, active speaking

Authors

  • Martin Cooke
  • Yan-Chen Lu
  • Youyi Lu
  • Radu Horaud
Abstract

A static view of the world permeates most research in speech and hearing. In this idealised situation, sources don’t move and neither do listeners; the acoustic environment doesn’t change; and speakers speak without any effect of auditory input from their own voice or other speakers. Corpora for speech research and most behavioural tasks have grown to reflect the static viewpoint. Yet it is clear that speech and hearing take place in a world where none of the static assumptions hold, or at least not for long. The dynamic view complicates traditional signal processing approaches, and renders conventional evaluation processes unrepeatable, since the observer’s dynamics influence the signals received at the ears. However, the dynamic viewpoint also provides many opportunities for active processes to exploit. Some of these, such as the use of head movements to resolve front-back confusions, are well known, while others exist solely as hypotheses. This paper reviews known and potential benefits of active processes in both hearing and speech production, and goes on to describe two recent studies which demonstrate the value of such processes. The first shows how dynamic cues can be used to estimate distance in an acoustic environment. The second demonstrates that the changes in speech production which take place when other speakers are active result in increased glimpsing opportunities at the ear of the interlocutor.

INTRODUCTION

The listening problem

The classical account of the issues faced by listeners has been illustrated by the cocktail party problem (CPP): how do listeners manage to decipher speech in the presence of other sound sources, including competing talkers (Cherry, 1953)? The CPP has inspired both behavioural and computational studies which have focused on the use of cues such as fundamental frequency and interaural time differences, invoking principles such as old-plus-new and continuity to handle the introduction and tracking of new sources.

The CPP has led to a focus on the existence of multiple sources and, in algorithmic terms, a welcome move away from the idea, prevalent in speech enhancement, that ‘noise’ is a quasi-stationary interference which can be suppressed. Several corpora based on the CPP now exist, and in a recent evaluation of computational techniques for identifying utterances in the presence of another talker, one approach achieved super-human performance in some conditions (Kristjansson et al., 2006).

But is the CPP a reasonable description of the true listening problem? By focusing mainly on the idea of attending to a single source amongst multiple non-stationary sources, the CPP account has downplayed a key aspect of auditory scenes, namely their dynamics. Consider instead the following scenario. You arrive at an airport. Your route takes you through the wide, high check-in hall and then through narrow tunnel-like corridors lined with glass and steel to the more comfortably furnished departure lounge, then down even narrower corridors





Publication date: 2007